Trapezoidal words are finite words having at most n + 1 distinct factors of length n, for every n > 0. 
They encompass finite Sturmian words. We distinguish trapezoidal words into two disjoint subsets: 
open and closed trapezoidal words. A trapezoidal word is closed if its longest repeated prefix has 
exactly two occurrences in the word, the second one being a suffix of the word. Otherwise it is 
open. We show that open trapezoidal words are all primitive and that closed trapezoidal words are 
all Sturmian. We then show that trapezoidal palindromes are closed (and therefore Sturmian). This 
allows us to characterize the special factors of Sturmian palindromes. We end with several open 
problems. 
1 Introduction 

In combinatorics on words, the most famous class of infinite words is certainly that of Sturmian words. 
Sturmian words code digital straight lines with irrational slope in the discrete plane. They are 
characterized by the fact that they have exactly n + 1 factors of length n, for every n > 0. 

It is well known ([6], Proposition 2.1.17) that a finite word w is a factor of some Sturmian word if 
and only if w is a binary balanced word, that is, there exists a letter a such that for every pair of factors 
u and v of w of the same length, u and v contain the same number of a's up to one, i.e., 

$$\bigl|\,|u|_a - |v|_a\,\bigr| \le 1. \qquad (1)$$

Finite Sturmian words (finite factors of Sturmian words) have the property that they have at most 
n + 1 factors of length n, for every n > 0. However, this property does not characterize them, as shown 
by the word w = aaabab, which is not Sturmian since the factors aaa and bab do not satisfy (1). 

The set of finite words defined by the property that they have at most n + 1 factors of length n, for 
every n > 0, is called the set of trapezoidal words. 
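
The counterexample w = aaabab can be checked mechanically. The following Python sketch is only an illustration: the helper names `factors`, `complexity`, `is_trapezoidal` and `is_balanced` are ours, and the balance test is a brute-force reading of condition (1) with respect to the letter a.

```python
def factors(w, n):
    """Return the set of distinct factors of w having length n."""
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def complexity(w):
    """Return [f(0), ..., f(|w|)], the number of distinct factors of each length."""
    return [len(factors(w, n)) for n in range(len(w) + 1)]


def is_trapezoidal(w):
    """Check that w has at most n + 1 distinct factors of length n, for every n."""
    return all(c <= n + 1 for n, c in enumerate(complexity(w)))


def is_balanced(w, letter="a"):
    """Condition (1): letter counts of equal-length factors differ by at most 1."""
    for n in range(1, len(w) + 1):
        counts = [u.count(letter) for u in factors(w, n)]
        if max(counts) - min(counts) > 1:
            return False
    return True


w = "aaabab"
print(complexity(w))      # factor counts for lengths 0, 1, ..., 6
print(is_trapezoidal(w))  # True
print(is_balanced(w))     # False: aaa and bab violate (1)
```

The word passes the factor-count test but fails the balance test, which is exactly the gap between the two notions.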

The name comes from the fact that the graph of the complexity function of these words¹ defines a 
regular trapezium. Trapezoidal words were defined by de Luca, who observed that Sturmian words 
are trapezoidal [7]. The non-Sturmian trapezoidal words were then characterized by D'Alessandro [4]. 

In this paper, we distinguish trapezoidal words into two distinct classes, according to the 
definition below. 

Definition 1. Let w be a finite word over an alphabet Σ. We say that w is closed if the longest repeated 
prefix of w has exactly two occurrences in w, the second one being a suffix of w. 
A word which is not closed is called open. 

¹ The complexity function of a word w is the function that counts the number of distinct factors of each length in w. 

For example, the word aabbaa is closed, whereas the word aabbaaa is open. 
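
Definition 1 translates directly into a short test. The sketch below is illustrative only: the function names are ours, and we assume the convention that the empty word occurs at every position of w (so a word whose first letter is not repeated has the empty word as longest repeated prefix).

```python
def occurrences(w, u):
    """Start positions of all (possibly overlapping) occurrences of u in w."""
    return [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]


def longest_repeated_prefix(w):
    """Longest prefix of the non-empty word w occurring at least twice in w."""
    for n in range(len(w) - 1, -1, -1):
        if len(occurrences(w, w[:n])) >= 2:
            return w[:n]
    raise ValueError("w must be non-empty")


def is_closed(w):
    """Definition 1: exactly two occurrences, the second one being a suffix."""
    h = longest_repeated_prefix(w)
    occ = occurrences(w, h)
    return len(occ) == 2 and occ[1] == len(w) - len(h)


print(longest_repeated_prefix("aabbaa"), is_closed("aabbaa"))    # aa True
print(longest_repeated_prefix("aabbaaa"), is_closed("aabbaaa"))  # aa False
```

In aabbaaa the prefix aa occurs three times (at positions 0, 4 and 5), so the word is open.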

Remark 1. The notion of closed word is equivalent to that of periodic-like word [2]. A word w is 
periodic-like if its longest repeated prefix does not appear in w followed by different letters. 

The notion of closed word is also related to the concept of complete return to a factor u in a word w, 
as considered in [?]. A complete return to u in w is any factor of w having exactly two occurrences of u, 
one as a prefix and one as a suffix. Therefore, w is a closed word if and only if w is a complete return to 
its longest repeated prefix. 

In this paper, we distinguish trapezoidal words into open and closed. This allows us to establish 
some further properties of trapezoidal words. More precisely, we have that open trapezoidal words 
are all primitive (Lemma 13), while closed trapezoidal words are all Sturmian (Proposition 12). We 
characterize open trapezoidal words by means of their special factors (Proposition 10) and show that the 
longest special factor of a closed trapezoidal word is a central word (Lemma 14). We then show that 
trapezoidal palindromes are closed (Theorem 16). This allows us to characterize the special factors of 
Sturmian palindromes (Corollary 17). 



2 Trapezoidal Words 

An alphabet, denoted by Σ, is a finite set of symbols. A word over Σ is a finite sequence of symbols from 
Σ. We denote by alph(w) the subset of the alphabet Σ consisting of the letters appearing in w. 

The length of a word w is denoted by |w| and is the number of its symbols. We denote by $w_i$ the i-th 
letter of w. For a letter a ∈ Σ, we denote by $|w|_a$ the number of a's appearing in w. 

The set of all words over Σ is denoted by Σ*. The set of all words over Σ having length n is denoted 
by $\Sigma^n$. The empty word has length zero and is denoted by ε. 

Let $w = a_1 a_2 \cdots a_n$, n > 0, be a non-empty word over the alphabet Σ. The word 
$\tilde{w} = a_n a_{n-1} \cdots a_1$ is called the reversal of w. A palindrome is a word w such that $\tilde{w} = w$. 

A prefix of w is any word v such that v = ε or v is of the form $v = a_1 a_2 \cdots a_i$, with 1 ≤ i ≤ n. A suffix 
of w is any word v such that v = ε or v is of the form $v = a_i a_{i+1} \cdots a_n$, with 1 ≤ i ≤ n. A factor of w is a 
prefix of a suffix of w (or, equivalently, a suffix of a prefix of w). Therefore, a factor of w is any word v 
such that v = ε or v is of the form $v = a_i a_{i+1} \cdots a_j$, with 1 ≤ i ≤ j ≤ n. A factor of a word w is internal 
if it is neither a prefix nor a suffix of w. 

We denote by Pref(w), Suff(w) and Fact(w), respectively, the set of prefixes, suffixes and factors of 
the word w. 

The factor complexity of a word w is the function defined by $f_w(n) = |\mathrm{Fact}(w) \cap \Sigma^n|$, for every n ≥ 0. 
Notice that $f_w(1)$ is the number of distinct letters occurring in w. A binary word is a word w such that 
$f_w(1) = |\mathrm{alph}(w)| = 2$. 

A factor u of w is left special if there exist a, b ∈ Σ, a ≠ b, such that au, bu ∈ Fact(w). A factor u of 
w is right special if there exist a, b ∈ Σ, a ≠ b, such that ua, ub ∈ Fact(w). A factor u of w is bispecial if 
it is both left and right special. 

For example, let w = aabbb. The left special factors of w are ε, b and bb. The right special factors 
of w are ε and a. Therefore, the only bispecial factor of w is ε. 
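
The special factors of a short word can be listed by brute force. The following Python sketch (function names are ours; the empty word is represented by the empty string) reproduces the example w = aabbb.

```python
def all_factors(w):
    """Set of all factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def left_special(w):
    """Factors u such that cu is a factor of w for at least two letters c."""
    facts, letters = all_factors(w), set(w)
    special = [u for u in facts if sum((c + u) in facts for c in letters) >= 2]
    return sorted(special, key=lambda u: (len(u), u))


def right_special(w):
    """Factors u such that uc is a factor of w for at least two letters c."""
    facts, letters = all_factors(w), set(w)
    special = [u for u in facts if sum((u + c) in facts for c in letters) >= 2]
    return sorted(special, key=lambda u: (len(u), u))


def bispecial(w):
    """Factors that are both left and right special."""
    return [u for u in left_special(w) if u in right_special(w)]


w = "aabbb"
print(left_special(w))   # ['', 'b', 'bb']
print(right_special(w))  # ['', 'a']
print(bispecial(w))      # ['']
```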

A period for the word w is a positive integer p, with 0 < p ≤ |w|, such that $w_i = w_{i+p}$ for every 
i = 1, ..., |w| − p. Since |w| is always a period for w, we have that every non-empty word has at least 
one period. We can unambiguously define the period of the word w as the smallest of its periods. For 
example, the period of w = aabaaba is 3. 

The fractional root $z_w$ of a word w is its prefix whose length is equal to the period of w. So, for 
example, the fractional root of w = aabaaba is $z_w$ = aab. 
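
As a minimal sketch of these two definitions (the names `period` and `fractional_root` are ours, and indices are 0-based in the code, unlike in the text):

```python
def period(w):
    """Smallest p with 0 < p <= |w| such that w[i] == w[i + p] wherever defined."""
    for p in range(1, len(w) + 1):
        if all(w[i] == w[i + p] for i in range(len(w) - p)):
            return p
    raise ValueError("the empty word has no period")


def fractional_root(w):
    """Prefix of w whose length equals the period of w."""
    return w[:period(w)]


print(period("aabaaba"))           # 3
print(fractional_root("aabaaba"))  # aab
```

The loop always terminates with a value for a non-empty word, since p = |w| satisfies the condition vacuously.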

A word w is a power if there exists a non-empty word u and an integer n > 1 such that $w = u^n$. A 
word which is not a power is called primitive. 

The following parameters have been introduced by de Luca [7]: 

Definition 2. Let w be a word over Σ. We denote by $H_w$ the minimal length of a prefix of w which occurs 
only once in w. We denote by $K_w$ the minimal length of a suffix of w which occurs only once in w. 

Definition 3. Let w be a word over Σ. We denote by $L_w$ the minimal length for which there are no left 
special factors of that length in w. Analogously, we denote by $R_w$ the minimal length for which there are 
no right special factors of that length in w. 

Example 1. Let w = aaababa. The longest left special factor of w is aba, and it is also the longest 
repeated suffix of w; the longest right special factor of w is aa, and it is also the longest repeated prefix 
of w. Thus, we have $L_w = 4$, $K_w = 4$, $R_w = 3$ and $H_w = 3$. 
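
The four parameters of Definitions 2 and 3 can be computed by exhaustive search. The sketch below is an illustration under our own naming (`H`, `K`, `L`, `R` and the helper `first_length_without_special` are not from the paper); it recovers the values of Example 1.

```python
def all_factors(w):
    """Set of all factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def count_occurrences(w, u):
    """Number of (possibly overlapping) occurrences of u in w."""
    return sum(w[i:i + len(u)] == u for i in range(len(w) - len(u) + 1))


def H(w):
    """Minimal length of a prefix of w occurring only once in w."""
    return next(n for n in range(len(w) + 1) if count_occurrences(w, w[:n]) == 1)


def K(w):
    """Minimal length of a suffix of w occurring only once in w."""
    return next(n for n in range(len(w) + 1)
                if count_occurrences(w, w[len(w) - n:]) == 1)


def first_length_without_special(w, extend):
    """Smallest n such that no factor of length n has two distinct extensions."""
    facts, letters = all_factors(w), set(w)
    n = 0
    while any(len(u) == n and sum(extend(c, u) in facts for c in letters) >= 2
              for u in facts):
        n += 1
    return n


def L(w):
    """Minimal length for which w has no left special factor."""
    return first_length_without_special(w, lambda c, u: c + u)


def R(w):
    """Minimal length for which w has no right special factor."""
    return first_length_without_special(w, lambda c, u: u + c)


w = "aaababa"
print(H(w), K(w), L(w), R(w))  # 3 4 4 3
```

For this word one also has |w| = 7 = L + H = R + K.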

Notice that for every word w such that |alph(w)| > 1, the values $H_w$, $K_w$, $L_w$ and $R_w$ are positive 
integers. Moreover, one has $f_w(R_w) = f_w(L_w)$ and $\max\{R_w, K_w\} = \max\{L_w, H_w\}$ ([7], Corollary 4.1). 
The following proposition is from de Luca ([7], Proposition 4.2). 

Proposition 1. Let w be a word of length |w| such that |alph(w)| > 1 and set $m_w = \min\{R_w, K_w\}$ and 
$M_w = \max\{R_w, K_w\}$. The factor complexity $f_w$ is strictly increasing in the interval $[0, m_w]$, is 
nondecreasing in the interval $[m_w, M_w]$ and strictly decreasing in the interval $[M_w, |w|]$. Moreover, for i in the 
interval $[M_w, |w|]$, one has $f_w(i+1) = f_w(i) - 1$. If $R_w < K_w$, then $f_w$ is constant in the interval $[m_w, M_w]$. 

Proposition 1 allows one to give the following definition. 

Definition 4. A non-empty word w is trapezoidal if 

• $f_w(i) = i + 1$ for $0 \le i \le m_w$, 

• $f_w(i + 1) = f_w(i)$ for $m_w \le i \le M_w - 1$, 

• $f_w(i + 1) = f_w(i) - 1$ for $M_w \le i \le |w|$. 

Trapezoidal words were considered for the first time by de Luca [7]. The name trapezoidal was 
given by D'Alessandro [4]. The choice of the name is motivated by the fact that for these words the 
graph of the complexity function defines a regular trapezium (possibly degenerating into a triangle). 

Notice that by definition a trapezoidal word is a binary word. 

Example 2. The word w = aaababa considered in Example 1 is trapezoidal. See Figure 1. 
In the following proposition we gather some characterizations of trapezoidal words. 

Proposition 2. Let w be a binary word. The following conditions are equivalent: 

1. w is trapezoidal; 

2. $|w| = L_w + H_w$; 

3. $|w| = R_w + K_w$; 

4. w has at most one left special factor of length n for every n > 0; 

5. w has at most one right special factor of length n for every n > 0; 

6. w has at most n + 1 distinct factors of length n for every n > 0; 

7. $|f_w(n + 1) - f_w(n)| \le 1$ for every n > 0. 

A Classification of Trapezoidal Words 

Figure 1: The graph of the complexity function $f_w(n)$ of the trapezoidal word w = aaababa. One has 
$m_w = \min\{R_w, K_w\} = 3$ and $M_w = \max\{R_w, K_w\} = 4$. 


Proof. The equivalence of (1), (2), (3), (4) and (5) is in [7] and [4]. 

The equivalence of (5) and (6) follows from elementary considerations on the factor complexity 
of binary words. Indeed, it is easy to see that the number of distinct factors of length n + 1 of a binary word 
w is at most equal to the number of distinct factors of length n plus the number of right special factors of 
length n. 

The equivalence of (7) and (1) comes directly from the definitions and from Proposition 1. 

The proof is therefore complete. □ 
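
As a sanity check rather than a proof, the agreement of conditions 2, 3, 6 and 7 can be tested exhaustively on short words. The following Python sketch uses our own helper names and an arbitrary length bound of 8; consistently with Proposition 2, no binary word within that bound should satisfy some of these conditions but not the others.

```python
from itertools import product


def all_factors(w):
    """Set of all factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def occurrences(w, u):
    """Number of (possibly overlapping) occurrences of u in w."""
    return sum(w[i:i + len(u)] == u for i in range(len(w) - len(u) + 1))


def H(w):
    return next(n for n in range(len(w) + 1) if occurrences(w, w[:n]) == 1)


def K(w):
    return next(n for n in range(len(w) + 1)
                if occurrences(w, w[len(w) - n:]) == 1)


def first_length_without_special(w, extend):
    """Smallest n such that no factor of length n has two distinct extensions."""
    facts, letters = all_factors(w), set(w)
    n = 0
    while any(len(u) == n and sum(extend(c, u) in facts for c in letters) >= 2
              for u in facts):
        n += 1
    return n


def L(w):
    return first_length_without_special(w, lambda c, u: c + u)


def R(w):
    return first_length_without_special(w, lambda c, u: u + c)


def conditions(w):
    """Truth values of conditions 2, 3, 6 and 7 of Proposition 2 for w."""
    facts = all_factors(w)
    # f[n] is the number of factors of length n, for n = 0, ..., |w| + 1.
    f = [sum(len(u) == n for u in facts) for n in range(len(w) + 2)]
    return (
        len(w) == L(w) + H(w),
        len(w) == R(w) + K(w),
        all(c <= n + 1 for n, c in enumerate(f)),
        all(abs(f[n + 1] - f[n]) <= 1 for n in range(len(f) - 1)),
    )


def binary_words(max_len):
    """All words over {a, b} of length <= max_len containing both letters."""
    for n in range(2, max_len + 1):
        for letters in product("ab", repeat=n):
            if len(set(letters)) == 2:
                yield "".join(letters)


# Words on which the four conditions do not all agree (expected to be none).
disagreements = [w for w in binary_words(8) if len(set(conditions(w))) != 1]
print(disagreements)
```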

Recall that a finite word is Sturmian if and only if it is balanced, i.e., satisfies (1). The following 
proposition is from de Luca ([7], Proposition 7.1). 

Proposition 3. Let w be a binary word. If w is Sturmian, then w is trapezoidal. 

The inclusion in Proposition 3 is strict, since there exist trapezoidal words that are not Sturmian, e.g. 
the word w = aaababa considered in Example 1. 

Recall that a binary word w is rich (or full) [5] if it contains |w| + 1 distinct palindromic factors, that 
is the maximum number of distinct palindromic factors a word can contain. 

The following proposition is from de Luca, Glen and Zamboni ([8], Proposition 2). 

Proposition 4. Let w be a binary word. If w is trapezoidal, then w is rich. 

Again, the inclusion in Proposition 4 is strict, since there exist rich words that are not trapezoidal, 
e.g. the word w = aabbaa. 

D'Alessandro [4] characterized the non-Sturmian trapezoidal words. We report below his 
characterization. 

First, recall that a word w is unbalanced (i.e., w is not balanced) if and only if it contains a pair of 
pathological factors (f, g), that is, f and g are factors of w of the same length but they do not satisfy (1). 
Moreover, if f and g are chosen of minimal length, there exists a palindrome u such that f = aua and 
g = bub, for two different letters a and b (see [6], Proposition 2.1.3). 

We can also state that f and g are Sturmian words, since otherwise they would contain a pair of 
pathological factors shorter than |f| = |g| and hence w would contain such a pair of pathological factors, 
against the minimality of f and g. So the word u is a palindrome such that aua and bub are Sturmian 
words, i.e., u is a central word [12]. 

The following lemma is attributed to Aldo de Luca in [?]. 

Lemma 5. Let w be a non-Sturmian word and (f, g) the pair of pathological factors of w of minimal 
length. Then f and g do not overlap in w. 

The following is the characterization of trapezoidal non-Sturmian words given by D'Alessandro [4]. 

Theorem 6. Let w be a binary non-Sturmian word. Then w is trapezoidal if and only if 

$$w = pq, \quad \text{with } p \in \mathrm{Suff}(\{\tilde{z}_f\}^*),\ q \in \mathrm{Pref}(\{z_g\}^*),$$

where $\tilde{z}_f$ is the mirror image of the fractional root $z_f$ of f, $z_g$ is the fractional root of g, with (f, g) being 
the pair of pathological factors of w of minimal length. 

In particular, $K_w = |q|$ and the longest right special factor of w is the prefix of w of length $R_w - 1$. 

Example 3. Let w = aaababa be the non-Sturmian trapezoidal word considered in Example 1. We have 
f = aaa and g = bab, so that $z_f$ = a and $z_g$ = ba. The word w factorizes as w = pq, with p = aaa and 
q = baba. 
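
The factorization of Example 3 can be recovered by a small search. The sketch below is illustrative and rests on assumptions of ours: the helper names are hypothetical, and among the two pathological factors we take f to be the one containing more occurrences of the letter a, as in Example 3.

```python
def factors(w, n):
    """Set of distinct factors of w having length n."""
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def period(w):
    """Smallest period of the non-empty word w."""
    return next(p for p in range(1, len(w) + 1)
                if all(w[i] == w[i + p] for i in range(len(w) - p)))


def minimal_pathological_pair(w, letter="a"):
    """Shortest factors (f, g) of equal length violating balance, or None."""
    for n in range(1, len(w) + 1):
        by_count = sorted(factors(w, n), key=lambda u: u.count(letter))
        g, f = by_count[0], by_count[-1]
        if f.count(letter) - g.count(letter) > 1:
            return f, g
    return None


def factorizations(w):
    """All cuts w = pq with p a suffix of a power of the reversed fractional
    root of f, and q a prefix of a power of the fractional root of g."""
    pair = minimal_pathological_pair(w)
    if pair is None:
        return []
    f, g = pair
    zf_rev = f[:period(f)][::-1]  # mirror image of the fractional root of f
    zg = g[:period(g)]            # fractional root of g
    cuts = []
    for i in range(len(w) + 1):
        p, q = w[:i], w[i:]
        p_ok = (zf_rev * (len(p) // len(zf_rev) + 1)).endswith(p)
        q_ok = (zg * (len(q) // len(zg) + 1)).startswith(q)
        if p_ok and q_ok:
            cuts.append((p, q))
    return cuts


print(minimal_pathological_pair("aaababa"))  # ('aaa', 'bab')
print(factorizations("aaababa"))             # [('aaa', 'baba')]
```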


Hence, trapezoidal words are either Sturmian or of the form described in Theorem 6. The following 
result of de Luca, Glen and Zamboni states that trapezoidal palindromes are all Sturmian. 

Theorem 7 ([8]). The following conditions are equivalent: 

1. w is a trapezoidal palindrome; 

2. w is a Sturmian palindrome. 

Let us give a proof of this latter result based on Theorem 6. We first show that the words p and q in 
the factorization of Theorem 6 are Sturmian words. 

Lemma 8. Let w be a trapezoidal non-Sturmian word and let w = pq, with $p \in \mathrm{Suff}(\{\tilde{z}_f\}^*)$ and 
$q \in \mathrm{Pref}(\{z_g\}^*)$, be the factorization of Theorem 6. Then p and q are Sturmian words. 

Proof. Recall that any central word u that is not a power of a single letter can be uniquely written as 
$u = w_1 xy w_2 = w_2 yx w_1$, for two central words $w_1, w_2$ and different letters x, y (see [3], Proposition 1). 

Let u be the central word such that f = aua and g = bub. If u is not a power of a single letter, 
the fractional roots of f and g are $x w_2 y$ and $y w_1 x$ ([?], Lemma 2). This implies that $z_f$ and $z_g$ are both 
conjugate to standard Sturmian words². If $u = x^n$, x ∈ Σ, n ≥ 0, then the fractional roots of f and g are x 
and $y x^n$, so even in this case $z_f$ and $z_g$ are both conjugate to standard Sturmian words. 

By Theorem 1 in [10], any word whose fractional root is conjugate to a standard Sturmian word is 
a Sturmian word. This implies that every word belonging to $\{\tilde{z}_f\}^*$ or to $\{z_g\}^*$ is Sturmian. Since a factor of a 
Sturmian word is a Sturmian word, the claim follows. □ 

Now, let w be a trapezoidal palindrome. If w is non-Sturmian we can write, by Theorem 6, 
$w = pq = v_1 f g v_2 = \tilde{v}_2 g f \tilde{v}_1$, with $v_1, v_2 \in \{a,b\}^*$ such that $p = v_1 f$ and $q = g v_2$ (the last 
equality holds because w, f and g are palindromes). If $|v_1| = |v_2|$, then f = g, 
a contradiction. If $|v_1| \ne |v_2|$, then either f and g overlap (a contradiction with Lemma 5), or p (or q) 
contains f and g as factors, a contradiction since, by Lemma 8, p and q are Sturmian. So w cannot be a 
non-Sturmian word. 

Hence, we proved that trapezoidal palindromes are Sturmian. Since by Proposition 3 Sturmian 
words are trapezoidal, the claim of Theorem 7 follows. 

² Standard Sturmian words are words of length one or of the form uxy, with u a central word and x, y different letters [12]. 






3 Open and Closed Trapezoidal Words 

In this section we derive some properties of open and closed trapezoidal words. 

Proposition 9. Let w be a trapezoidal word. Then w is open (resp. closed) if and only if $\tilde{w}$ is open (resp. 
closed). 

Proof. If w is a closed trapezoidal word, then its longest repeated prefix is also its longest repeated suffix 
and has exactly two occurrences in w. This implies that $\tilde{w}$ has the same property. So $\tilde{w}$ is a closed 
trapezoidal word. Hence the set of closed trapezoidal words is closed under reversal. 

Since the whole set of trapezoidal words is closed under reversal ([?], Corollary 7) and since open 
trapezoidal words form the complement of closed trapezoidal words in the set of trapezoidal words, the 
set of open trapezoidal words is closed under reversal too. □ 

The following proposition gives a characterization of open trapezoidal words. 
Proposition 10. Let w be a trapezoidal word. Then the following conditions are equivalent: 

1. w is open; 

2. the longest repeated prefix of w is also the longest right special factor of w; 

3. the longest repeated suffix of w is also the longest left special factor of w. 

Proof. (1) ⇒ (2). Let h be the longest repeated prefix of w and x the letter such that hx is a prefix of 
w. Since w is not closed, h has a second non-suffix occurrence in w followed by a letter y. Since h is the 
longest repeated prefix of w, we have y ≠ x. Therefore, h is right special in w. 

Suppose that w has a right special factor r longer than h. Since w is trapezoidal, w has at most one 
right special factor for each length (Proposition 2). Since the suffixes of a right special factor are right 
special factors, we have that h must be a proper suffix of r. Since r is right special in w, it has at least two 
occurrences in w followed by different letters. This implies a non-prefix occurrence of hx in w, against 
the definition of h. 

(2) ⇒ (3). Let k be the longest repeated suffix of w. We first prove that k is left special in w. 
Otherwise, k appears in w exactly twice, once as a prefix and once as a suffix of w. This implies that 
k = h, the longest repeated prefix of w. This is a contradiction, since by hypothesis h is right special in w and 
therefore it has at least one non-suffix occurrence. 

It remains to prove that k is the longest left special factor of w. Since w is trapezoidal, we have, by 
Proposition 2, $|w| = L_w + H_w = R_w + K_w$. Since by hypothesis the longest repeated prefix of w is also the 
longest right special factor of w, we have $H_w = R_w$ and therefore $L_w = K_w$. Thus, the longest left special 
factor of w has length equal to $L_w - 1 = K_w - 1 = |k|$. 

(3) ⇒ (1). Let k be the longest repeated suffix of w and let ak and bk be factors of w, for different 
letters a and b. Then we have that $\tilde{k}$ is the longest repeated prefix of $\tilde{w}$, and $\tilde{k}a$ and $\tilde{k}b$ are factors of $\tilde{w}$. 
This proves that the word $\tilde{w}$ is open. The claim then follows from Proposition 9. □ 

Lemma 11. Let w be a trapezoidal word. If w is open, then $H_w = R_w$ and $K_w = L_w$. If w is closed, then 
$H_w = K_w$ and $L_w = R_w$. 

Proof. The claim for open trapezoidal words follows directly from Proposition 10. 

Suppose that w is a closed trapezoidal word. Then $H_w = K_w$ (since w is closed) and therefore $L_w = R_w$ 
(since w is trapezoidal). □ 
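
Lemma 11 lends itself to an exhaustive check on short words. The Python sketch below is only a sanity test under our own conventions (hypothetical helper names, words over {a, b} of length at most 8, and the empty word counted as occurring at every position); it looks for trapezoidal words violating the lemma and is expected to find none.

```python
from itertools import product


def all_factors(w):
    """Set of all factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def occurrences(w, u):
    """Start positions of all occurrences of u in w."""
    return [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]


def is_trapezoidal(w):
    """Binary word with at most n + 1 distinct factors of each length n."""
    facts = all_factors(w)
    return len(set(w)) == 2 and all(
        sum(len(u) == n for u in facts) <= n + 1 for n in range(len(w) + 1))


def is_closed(w):
    """Longest repeated prefix occurs exactly twice, the second time as a suffix."""
    n = max(k for k in range(len(w)) if len(occurrences(w, w[:k])) >= 2)
    occ = occurrences(w, w[:n])
    return len(occ) == 2 and occ[1] == len(w) - n


def H(w):
    return next(n for n in range(len(w) + 1) if len(occurrences(w, w[:n])) == 1)


def K(w):
    return next(n for n in range(len(w) + 1)
                if len(occurrences(w, w[len(w) - n:])) == 1)


def first_length_without_special(w, extend):
    facts, letters = all_factors(w), set(w)
    n = 0
    while any(len(u) == n and sum(extend(c, u) in facts for c in letters) >= 2
              for u in facts):
        n += 1
    return n


def L(w):
    return first_length_without_special(w, lambda c, u: c + u)


def R(w):
    return first_length_without_special(w, lambda c, u: u + c)


def lemma_11_holds(w):
    h, k, l, r = H(w), K(w), L(w), R(w)
    return (h == k and l == r) if is_closed(w) else (h == r and k == l)


words = ("".join(t) for n in range(2, 9) for t in product("ab", repeat=n))
counterexamples = [w for w in words
                   if is_trapezoidal(w) and not lemma_11_holds(w)]
print(counterexamples)
```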



G. Fici 






Open trapezoidal words can be Sturmian (e.g. w = aaabaa) or not (e.g. w = aaabab). Closed 
trapezoidal words, instead, are always Sturmian, as shown in the following proposition. 

Proposition 12. Let w be a trapezoidal word. If w is closed, then w is Sturmian. 

Proof. Suppose that w is not Sturmian. Then, by Theorem 6, w = pq, $p \in \mathrm{Suff}(\{\tilde{z}_f\}^*)$, $q \in \mathrm{Pref}(\{z_g\}^*)$, 
with (f, g) being the pair of pathological factors of w of minimal length, $K_w = |q|$ (and hence $R_w = |p|$) 
and the longest right special factor of w is the prefix of w of length $R_w - 1$. 

Since w is closed, we have, by Lemma 11, $H_w = K_w$. So the suffix k of q of length $K_w - 1$ is also the 
longest repeated prefix of w and appears in w only as a prefix and as a suffix. 

If $R_w \ge K_w$, then k is a prefix of the longest right special factor of w. This is a contradiction with the 
fact that k appears in w only as a prefix and as a suffix. 

If $R_w < K_w$, then p is a prefix of k and hence p is a factor of q. This implies that f is a factor of q, 
and therefore q contains both f and g as factors. Hence q would be non-Sturmian, a contradiction with 
Lemma 8. □ 
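
The statement and the two examples given before it can be tested by brute force on short words. This Python sketch is an illustration, not part of the proof: the helper names are ours, "Sturmian" is tested as "balanced", and the search is limited to words over {a, b} of length at most 9.

```python
from itertools import product


def occurrences(w, u):
    """Start positions of all occurrences of u in w."""
    return [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]


def is_trapezoidal(w):
    """Binary word with at most n + 1 distinct factors of each length n."""
    return len(set(w)) == 2 and all(
        len({w[i:i + n] for i in range(len(w) - n + 1)}) <= n + 1
        for n in range(len(w) + 1))


def is_closed(w):
    """Longest repeated prefix occurs exactly twice, the second time as a suffix."""
    n = max(k for k in range(len(w)) if len(occurrences(w, w[:k])) >= 2)
    occ = occurrences(w, w[:n])
    return len(occ) == 2 and occ[1] == len(w) - n


def is_balanced(w):
    """Numbers of a's in factors of equal length differ by at most one."""
    for n in range(1, len(w) + 1):
        counts = [w[i:i + n].count("a") for i in range(len(w) - n + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


words = ("".join(t) for n in range(2, 10) for t in product("ab", repeat=n))
closed_unbalanced = [w for w in words
                     if is_trapezoidal(w) and is_closed(w) and not is_balanced(w)]
print(closed_unbalanced)  # expected to be empty by Proposition 12

for w in ("aaabaa", "aaabab"):
    print(w, is_trapezoidal(w), is_closed(w), is_balanced(w))
```

Both example words are trapezoidal and open; only aaabaa is balanced.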



The result stated in Proposition 12 can also be found in a paper of Bucci, de Luca and De Luca ([1], 
Proposition 3.6). 



As a corollary of Proposition 12 we have that every trapezoidal word is open or Sturmian. We 
therefore propose the following 

Problem 1. Give a characterization of open Sturmian words. 

Trapezoidal words, as well as Sturmian words, can be primitive (e.g. w = aabaaa) or not (e.g. w = 
aabaab). Open trapezoidal words (and in particular, then, non-Sturmian trapezoidal words) are always 
primitive. 

Lemma 13. Every open trapezoidal word is primitive. 

Proof. Suppose that w is not primitive. Let $w = u^n$, for a non-empty primitive word u and an integer 
n > 1. The longest repeated prefix of w is therefore $u^{n-1}$, which is also a suffix of w. Moreover, by 
elementary combinatorics on words, $u^{n-1}$ cannot have internal occurrences in w. Hence w is closed. □ 



The converse of Lemma 13 does not hold. Indeed, there exist trapezoidal words that are primitive 
but not open, e.g. w = aabaa. 
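
Both the lemma and the failure of its converse can be illustrated on short words. In the Python sketch below (helper names are ours; the length bound 9 is arbitrary), primitivity is tested with the standard fact that a non-empty word w is primitive if and only if w occurs in ww only as a prefix and as a suffix.

```python
from itertools import product


def occurrences(w, u):
    """Start positions of all occurrences of u in w."""
    return [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]


def is_trapezoidal(w):
    """Binary word with at most n + 1 distinct factors of each length n."""
    return len(set(w)) == 2 and all(
        len({w[i:i + n] for i in range(len(w) - n + 1)}) <= n + 1
        for n in range(len(w) + 1))


def is_closed(w):
    """Longest repeated prefix occurs exactly twice, the second time as a suffix."""
    n = max(k for k in range(len(w)) if len(occurrences(w, w[:k])) >= 2)
    occ = occurrences(w, w[:n])
    return len(occ) == 2 and occ[1] == len(w) - n


def is_primitive(w):
    """w is primitive iff its first non-trivial occurrence in ww is at |w|."""
    return (w + w).find(w, 1) == len(w)


words = ("".join(t) for n in range(2, 10) for t in product("ab", repeat=n))
open_powers = [w for w in words
               if is_trapezoidal(w) and not is_closed(w) and not is_primitive(w)]
print(open_powers)  # expected to be empty by Lemma 13

w = "aabaa"
print(is_trapezoidal(w), is_primitive(w), is_closed(w))  # True True True
```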

We now focus on closed trapezoidal words and their special factors. 

Lemma 14. Let w be a closed trapezoidal word and let u be the longest left special factor of w. Then 
u is also the longest right special factor of w (and thus u is a bispecial factor of w). Moreover, u is a 
central word. 

Proof. Let u be the longest left special factor of w. Hence there exist different letters a, b ∈ Σ such that 
au and bu are factors of w. 

We claim that both au and bu occur in w followed by some letter. Indeed, suppose the contrary. Then 
one of the words au and bu, say au, appears in w only as a suffix. Let k be the longest repeated suffix of 
w. Since u is a repeated suffix of w, we have |k| ≥ |u|. If |k| = |u|, then k = u and since w is closed, u 
appears in w only as a prefix and as a suffix, against the hypothesis that bu is a factor of w. So |k| > |u| 
and therefore au must be a suffix of k. This implies an internal occurrence of au in w, a contradiction. 

So, there exist letters x, y such that aux and buy are factors of w. Now, we must have x ≠ y, since 
otherwise ux would be a left special factor of w longer than u. Thus u is right special in w. Since w is 
closed, we have, by Lemma 11, $H_w = K_w$ and $L_w = R_w$, so w cannot contain a right special factor longer 
than u. Thus u is the longest right special factor of w. 

By Proposition 12, w is Sturmian. Since u is a bispecial factor of w, and since a factor of a Sturmian 
word is a Sturmian word, in order to prove that u is a central word it is sufficient to prove that u is a 
palindrome. Suppose the contrary. Then there exist a prefix z of u and a letter a ∈ Σ such that za is a prefix 
of u and $b\tilde{z}$ is a suffix of u, for a letter b different from a. Since u is bispecial in w, this implies that 
aza and $b\tilde{z}b$ are both factors of w. This implies that w is not balanced, a contradiction with Proposition 12. □ 



By Theorem 7, trapezoidal palindromes coincide with Sturmian palindromes. Deep and interesting 
results on Sturmian palindromes can be found in [8] and [9]. In particular, we want to recall the following 

Theorem 15 ([9], Theorem 29). A palindrome w ∈ Σ* is Sturmian if and only if $\pi_w = R_w + 1$, where $\pi_w$ denotes the period of w. 

The next theorem shows that Sturmian palindromes are all closed words. 

Theorem 16. Let w be a trapezoidal (Sturmian) palindrome. Then w is closed. 

Proof. By contradiction, suppose that w is open. Then h, the longest repeated prefix of w, is also the 
longest right special factor of w (Proposition 10). Since w is a palindrome, we have that the longest 
repeated suffix of w is $\tilde{h}$, the reversal of h. In particular, then, $K_w = H_w$. 

By Lemma 11, $H_w = R_w$ and $K_w = L_w$. Thus we have $H_w = R_w = K_w = L_w = |w|/2$, since w is 
trapezoidal (see Proposition 2). It follows that $w = hxx\tilde{h}$, for a letter x ∈ Σ. 

By Theorem 15 the period of w is $R_w + 1 = |hxx|$, so we have $h = \tilde{h}$. Therefore, we have w = hxxh. 

Since h is right special in w, there exists a letter y ≠ x such that hy is a factor of w. Now, any 
occurrence of hy in w cannot be preceded by the letter x, since $h = \tilde{h}$ is the longest repeated suffix of w. 
Thus w contains the factor yhy. 

Hence, w contains both hxx and yhy as factors, and this contradicts the fact that w is Sturmian. □ 
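
Theorem 16, together with Theorem 7, predicts that every trapezoidal palindrome is both closed and balanced. The following Python sketch checks this prediction exhaustively for words over {a, b} of length at most 10; it is a sanity test under our own helper names and conventions, not a substitute for the proof.

```python
from itertools import product


def occurrences(w, u):
    """Start positions of all occurrences of u in w."""
    return [i for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u]


def is_trapezoidal(w):
    """Binary word with at most n + 1 distinct factors of each length n."""
    return len(set(w)) == 2 and all(
        len({w[i:i + n] for i in range(len(w) - n + 1)}) <= n + 1
        for n in range(len(w) + 1))


def is_closed(w):
    """Longest repeated prefix occurs exactly twice, the second time as a suffix."""
    n = max(k for k in range(len(w)) if len(occurrences(w, w[:k])) >= 2)
    occ = occurrences(w, w[:n])
    return len(occ) == 2 and occ[1] == len(w) - n


def is_balanced(w):
    """Numbers of a's in factors of equal length differ by at most one."""
    for n in range(1, len(w) + 1):
        counts = [w[i:i + n].count("a") for i in range(len(w) - n + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


words = ("".join(t) for n in range(2, 11) for t in product("ab", repeat=n))
trapezoidal_palindromes = [w for w in words
                           if w == w[::-1] and is_trapezoidal(w)]
exceptions = [w for w in trapezoidal_palindromes
              if not (is_closed(w) and is_balanced(w))]
print(len(trapezoidal_palindromes), exceptions)
```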

Remark 2. The equivalence between trapezoidal palindromes and Sturmian palindromes (Theorem 7) 
can also be derived as a consequence of Theorem 16, Proposition 12 and Proposition 3. 

From Theorem 16 and Lemma 14 we derive the following characterization of the special factors of 
Sturmian palindromes. 

Corollary 17. Let w be a trapezoidal (Sturmian) palindrome. Then the longest left special factor of w is 
also the longest right special factor of w and it is a central word. 

Example 4. Let w = aababaa. The longest left special factor of w is aba, which is also its longest right 
special factor and is a central word. 



4 Conclusions and Open Problems 

In this paper we distinguished trapezoidal words into two disjoint classes: open and closed. We derived 
some combinatorial and structural properties of these two classes of words. 

Many directions for further development can arise. For example, a challenging problem could be that 
of finding a characterization of open Sturmian words, that is, of Sturmian words for which the longest 
repeated prefix is also the longest right special factor (Problem 1). 

Another interesting problem concerns enumeration. Enumeration formulae for Sturmian words [13] 
and for primitive Sturmian words [10] are known. To the best of our knowledge, an enumeration formula 
for trapezoidal words is not yet known. A possible direction for finding it could be enumerating open 
and closed trapezoidal words separately. 